Search Query Syntax
KEY:
1. Lower case words are meta-definitions (i.e. non-terminals defined in terms of terminals and other non-terminals).
2. UPPER case words are terminal symbols (KEY or reserved words).
3. '|' implies "OR" if it begins a line. A character enclosed in single quotes (' ') is a literal terminal symbol of that single character. The '*' symbol means the previous object repeated 0 or more times. All other punctuation marks are literal terminal symbols.
4. A definition in comments /* ... */ is an English explanation or a regular expression definition.
---------------------------------------------------------------------
query               ::= set-query
                      | set-query SET range-list
set-query           ::= term
                      | set-query AND term
                      | set-query BUTNOT term
term                ::= field-item
                      | & field-item
                      | term OR field-item
                      | & term OR field-item
field-item          ::= compare-condition
                      | between-condition
                      | proximate-condition
compare-condition   ::= field-spec comp-op word
comp-op             ::= = | != | < | <= | > | >=
between-condition   ::= field-spec BETWEEN word , word
                      | field-spec OUTSIDE word , word
proximate-condition ::= phrase-term
                      | phrase-term PROXIMITY distance
distance            ::= constant group-unit
group-unit          ::= WORD[S] | SENTENCE[S] | PARAGRAPH[S] | DOCUMENT[S]
phrase-term         ::= phrase-list
                      | phrase-term phrase-term-op phrase-list
phrase-term-op      ::= + | ~
phrase-list         ::= field-phrase
                      | phrase-list , field-phrase
field-phrase        ::= phrase
                      | field-spec: phrase
                      | phrase IN field-spec
phrase              ::= phrase-item
                      | ( set-query )
phrase-item         ::= approx-word
                      | phrase-item order-op approx-word
order-op            ::= ' ' | -
field-spec          ::= DICTIONARY constant field-list
                      | DRI constant field-list
                      | XPATH path-spec
                      | TAG path-spec
                      | field-list
path-spec           ::= tag-spec
                      | "tag-spec"
tag-spec            ::= //tag-list
                      | /tag-list
                      | tag-list
tag-list            ::= tag
                      | @tag
                      | tag-list / tag
field-list          ::= ALL
                      | FIELD[S] aalist-spec
aalist-spec         ::= aalist-item
                      | aalist-spec, aalist-item
aalist-item         ::= constant
                      | ~ constant
                      | constant[aaval-spec]
                      | ~ constant[aaval-spec]
aaval-spec          ::= aaval-spec-item
                      | aaval-spec , aaval-spec-item
aaval-spec-item     ::= constant
                      | constant .. constant
approx-word         ::= word
                      | @ word
word                ::= real-word
                      | "exact-order-phrase"
                      | 'literal-phrase'
real-word           ::= numeric
                      | id
                      | id'*'
                      | id'?'*
range-list          ::= constant-list
                      | BETWEEN constant, constant
                      | OUTSIDE constant, constant
constant-list       ::= constant
                      | constant-list , constant
exact-order-phrase  ::= real-word
                      | real-word' 'exact-order-phrase
literal-phrase      ::= /* Any character, including blanks and dashes */
constant            ::= /* [0-9]+ */
id                  ::= /* [A-Z][A-Z0-9_]* , or if in quotes, can be any non-blank character */
numeric             ::= /* constant or floating pt. # (e.g. 1.23) */
tag                 ::= /* XML generic ID */
------------------------------------------------------------------------
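The lexical definitions at the end of the grammar (constant, id, numeric, and the truncated forms of real-word) can be illustrated with a short Python sketch. This is only an illustration, not part of the query syntax: the function name classify_word and the labels it returns are hypothetical, the floating point pattern is a guess based on the single example "1.23", and folding input to upper case before matching is an assumption made so that lower case input fits the [A-Z] definition of id.

```python
import re

# Patterns transcribed from the /* ... */ comments in the grammar.
CONSTANT = re.compile(r"[0-9]+")                      # constant ::= [0-9]+
ID = re.compile(r"[A-Z][A-Z0-9_]*")                   # id ::= [A-Z][A-Z0-9_]*
# Assumption: a floating point number is digits '.' digits, as in "1.23".
FLOAT = re.compile(r"[0-9]+\.[0-9]+")
# real-word ::= id'*' | id'?'*  (an id followed by '*' or by one or more '?')
TRUNCATED = re.compile(r"[A-Z][A-Z0-9_]*(\*|\?+)")


def classify_word(token: str) -> str:
    """Return a label for the grammar class of a bare (unquoted) token."""
    token = token.upper()  # assumption: matching is case-insensitive
    if CONSTANT.fullmatch(token):
        return "constant"  # a constant is also a valid numeric
    if FLOAT.fullmatch(token):
        return "numeric"
    if ID.fullmatch(token):
        return "id"
    if TRUNCATED.fullmatch(token):
        return "truncated id"
    return "unknown"


if __name__ == "__main__":
    for tok in ["47", "1.23", "joe", "THORN*", "ab??", "4X"]:
        print(tok, "->", classify_word(tok))
```

Quoted words are deliberately left out of the sketch, since the grammar allows any non-blank character inside quotes and the handling of quotes may differ between implementations.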
EXPLANATIONS:
A. A <real-word> definition can vary from database to database. Right-truncation wildcards are supported, along with embedded wildcards (i.e. one or more wildcard characters mixed with specific characters). The representation may vary according to word definition, but will support the following level of functionality:
   Form 1: xxx*   <- matches ALL words starting with "xxx".
   Form 2: xxx??  <- matches ALL words starting with "xxx" that are of length <= 5.
   Form 3: xx*yy  <- matches ALL words starting with "xx" and ending in "yy".
   Form 4: x??y*  <- matches ALL words starting with "x", followed by two arbitrary characters, then a "y", then 0 or more extra characters.
B. A <constant> is an integer (e.g. 47).
C. A quoted word is not interpreted, so anything inside the quotes constitutes a word, except for a blank (' '). Blanks inside a double-quoted string are construed as separators denoting a string of words to find in exact order. Some implementations may have this feature turned off. Strings inside a single-quoted phrase have NO characters interpreted, including the blank (' ') character.
D. A NOT condition cannot form a single term.
E. Sentences may or may not be implemented in a specific application.
F. The '&' symbol is an anchor. It forces a term to be evaluated first instead of in the MIN-term order.
G. The '@' symbol means to find the word CLOSEST to the given word.
H. The '~' symbol means the NOT or EXCLUDE operator.
I. Note the difference between the 'AND' operator and '+'. The AND operator applies to an entire document, whereas the '+' operator is constrained by the PROXIMITY expression. If no proximity is applied, the two operators are equivalent.
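The four wildcard forms in explanation A can be sketched as a translation into regular expressions. The following Python is a minimal illustration under stated assumptions, not the behavior of any particular implementation: the function names wildcard_to_regex and matches are hypothetical, a trailing run of '?' is read as "up to that many extra characters" (to agree with Form 2), and an embedded '?' is read as exactly one arbitrary character (to agree with Form 4).

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard word into a compiled regular expression.

    Assumed reading of explanation A:
      '*'            zero or more arbitrary characters
      embedded '?'   exactly one arbitrary character
      trailing '?'s  up to that many arbitrary characters
    """
    stripped = pattern.rstrip("?")
    trailing = len(pattern) - len(stripped)  # number of trailing '?'
    parts = []
    for ch in stripped:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # literal character
    if trailing:
        parts.append(".{0,%d}" % trailing)
    return re.compile("".join(parts))


def matches(pattern: str, word: str) -> bool:
    """True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


if __name__ == "__main__":
    print(matches("xxx*", "xxxabc"))   # Form 1
    print(matches("xxx??", "xxxabc"))  # Form 2: length 6 exceeds 5
    print(matches("xx*yy", "xxabcyy")) # Form 3
    print(matches("x??y*", "xabyzz"))  # Form 4
```

Since the representation of wildcards may vary according to the word definition of a database, the two readings of '?' above should be checked against the specific application before relying on them.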
EXAMPLES
Note that in the following examples, #defines are used to give meaning to fields and are NOT part of the query syntax.
More XML XPATH examples:
Given the XML markup:

   <employee-record>
     <identification>
       <name nickname = "buddy">
         <first>joe</first><middle>bob</middle><last>thornton</last>
       </name>
       <ss number = "366546660"></ss>
       <age>47</age>
     </identification>
     <status> disabled </status>
     <salary> 40000 </salary>
   </employee-record>

The following example searches use path notation to describe the record: